System and method for continuous data analysis of an ongoing clinical trial

ABSTRACT

A system and method of continuously analyzing trial data of an ongoing clinical trial are provided. A statistical analysis is performed on a trial database containing subject trial data without suspending the ongoing clinical trial. If the result of the statistical analysis does not exceed a predetermined threshold value, then the statistical analysis is repeated while the clinical trial is ongoing. In a blinded clinical trial, a grouped database is generated from the trial database and a blinding database prior to performing the statistical analysis. The grouped database groups the subject trial data according to the study groups. The ability to continuously monitor and analyze the trial data for statistical significance in tandem with data collection while the trial is ongoing provides many benefits to the researchers because the trial database is no longer the bottleneck in obtaining useful results and statistical analysis can be conducted on a near real-time basis without having to wait until completion of the trial.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. application Ser. No. 11/810,483 filed on Jun. 6, 2007, which is a continuation-in-part of U.S. application Ser. No. 10/667,848 filed Sep. 22, 2003, now abandoned, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD OF THE INVENTION

This application relates to data processing of clinical trial data and more specifically to a system and method for statistically analyzing the clinical trial data.

BACKGROUND OF THE INVENTION

In the United States, the Food and Drug Administration (FDA) oversees the protection of consumers exposed to health-related products ranging from food and cosmetics to drugs, gene therapies, and medical devices. Under the FDA guidance, clinical trials are performed to test the safety and efficacy of new drugs, medical devices or other treatments to ultimately ascertain whether or not a new medical therapy is appropriate for widespread human consumption.

More specifically, once a new drug or medical device has undergone studies in animals, and results appear favorable, it can be studied in humans. Before human testing is begun, findings of animal studies are reported to the FDA to obtain approval to do so. This report to the FDA is called an application for an Investigational New Drug (IND).

The process of experimentation is referred to as a clinical trial, which involves four phases. In Phase I, a few research participants, referred to as subjects (approximately 5 to 10), are used to determine toxicity of a new treatment. In Phase II, more subjects (10-20) are used to determine efficacy and further ascertain safety. Doses are stratified to try to gain information about the optimal dose. A treatment may be compared to either a placebo or another existing therapy. In Phase III, efficacy is determined. For this phase, more subjects, on the order of hundreds to thousands of patients, are needed to perform a meaningful statistical analysis. A treatment may be compared to either a placebo or another existing therapy. In Phase IV (post-approval study), the treatment has already been approved by the FDA, but more testing is performed to evaluate long-term effects and to evaluate other indications.

During clinical trials, patients are seen at medical clinics and asked to participate in a clinical research project by their doctor, known as an investigator. After the patients sign an informed consent form, they are considered enrolled in the study, and are subsequently referred to as study subjects. A study sponsor, generally considered to be the company developing a new medical treatment and supporting the research, develops a study protocol. The study protocol is a document describing the reason for the experiment, the rationale for the number of subjects required, the methods used to study the subjects, and any other guidelines or rules for how the study is to be conducted. Prior to usage, the study protocol is reviewed and approved by an Institutional Review Board (IRB). An IRB serves as a peer review group, which evaluates a protocol to determine its scientific soundness and ethics for the protection of the subjects and investigator.

Subjects enrolled in a clinical study are stratified into groups that allow data to be assessed in a comparative fashion. In a common example, one study arm, known as a control group (or “control”), will use a placebo, whereby a pill containing no active chemical ingredient is administered. In doing so, comparisons can be made between subjects receiving actual medication versus placebo.

Subjects enrolled into a clinical study are assigned to a study arm in a random fashion, which is done to avoid biases that may occur in the selection of subjects for a trial. For example, a subject who is a particularly good candidate to respond to a new medication might be intentionally entered into the study arm to receive real medication and not a placebo. This could skew the data and outcome of the clinical trial to favor the medication under study, by the selection of subjects who are most likely to perform well with the medication. In instances where only one study group is present, randomization is not performed.

Blinding is a process by which the study arm assignment for subjects in a clinical trial is not revealed to the subject (single blind) or to both the subject and the investigator (double blind). This minimizes the risk of data bias. Virtually all randomized trials are blinded by definition. In instances where only one study group is present, blinding is not performed.

Generally, at the end of the trial, the database containing the completed trial data is shipped to a statistician for analysis. If particular occurrences, such as adverse events, are seen with an incidence that is greater in one group over another such that it exceeds the likelihood of pure chance alone, then it can be stated that statistical significance has been reached. Using statistical calculations, the comparative incidence of any given occurrence between groups can be described by a numeric value, referred to as a “p-value”. A p-value of 1.0 indicates that there is a 100% likelihood that an incident occurred as the result of chance alone. Conversely, a p-value of 0.0 indicates that there is a 0% likelihood that an incident occurred as a result of chance alone. Generally, values of p<0.05 are considered to be “statistically significant”, and values of p<0.01 are considered “highly statistically significant”.
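As an illustration only, the comparison of incidence between two groups can be sketched with a pooled two-proportion z-test, which is one common large-sample way to obtain a two-sided p-value. The text above does not name a particular test, so the choice of test and the function name `two_proportion_p_value` are assumptions of this sketch. Strictly, the value computed here is the probability of seeing a difference at least this large if the two groups truly had the same event rate.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for a difference in event rates (pooled z-test).

    Hypothetical helper: a large-sample approximation, not an exact test.
    """
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # Every subject (or no subject) had the event: no evidence of a difference.
        return 1.0
    z = (events_a / n_a - events_b / n_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Example: 30 of 100 subjects with an event in one group, 10 of 100 in the other.
p = two_proportion_p_value(30, 100, 10, 100)
print("significant at 0.05:", p < 0.05)
```

For small groups, such as the 13 and 12 subjects of the simplified example tables, an exact test would usually be preferred over this approximation.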

In some clinical trials, multiple study arms, or even a control group, may not be utilized. In such cases, only a single study group exists with all subjects receiving the same treatment. This is typically performed when historical data about the medical treatment, or a competing treatment is already known from prior clinical trials, and may be utilized for the purpose of making comparisons.

The creation of study arms, randomization, and blinding are techniques that are used in most clinical trials where scientific rigor is of high importance. However, these methods lead to several challenges, since they prevent the clinical trial sponsor from tracking key information related to safety and efficacy.

Regarding safety, the objective of any clinical trial is to document the safety of a new treatment. However, in clinical trials where randomization is conducted between two or more study arms, this can be determined only as a result of analyzing and comparing the safety parameters of one study group to another. Unfortunately, because the study arm assignments are blinded, there is no way to separate out subjects and their data into corresponding groups for purposes of performing comparisons while the trial is being conducted. Since many clinical trials may last for time periods extending for years, it is conceivable to have a treatment toxicity go unnoticed for prolonged periods without intervention.

Regarding efficacy, any clinical trial seeking to document efficacy will incorporate key variables that are followed during the course of the trial to draw the desired conclusion. In addition, studies will define certain outcomes, or endpoints, at which point a study subject is considered to have completed the protocol. These parameters, including both key variables and study endpoints, cannot be analyzed by comparison between study arms while the subjects are randomized and blinded. This poses potential problems in ethics and statistical analysis.

When new medications or other health-related treatments are of superior efficacy to anything else, it is ethical to allow usage of the treatment for those in imminent need, even prior to final government approval. Conversely, when available, it is considered unethical to withhold such treatments. For example, if a medication were to be identified that eradicated the Human Immunodeficiency Virus (HIV), it would be unethical to allow diseased patients to continue suffering and even die of the illness, while the medication was being clinically tested for purposes of government approval. Ideally, in such situations, identification of effective treatments should occur early in the project. Under these circumstances, non-treatment arms (i.e., those taking placebos) could be construed as unethical and should be eliminated. At present, when clinical trials are randomized and blinded, identification of a particularly effective treatment may not be realized until the entire clinical trial is completed.

Another related problem is statistical power. By definition, statistical power refers to the probability of a test appropriately rejecting the null hypothesis, i.e., the hypothesis that an experiment's outcome is the result of chance alone. Clinical research protocols are engineered to prove a certain hypothesis about a medical treatment's safety and efficacy, and disprove the null hypothesis. To do so, statistical power is required, which can be achieved by obtaining a large enough sample size of subjects in each study arm. When too few subjects are enrolled into the study arms, there is the risk of the study not accruing enough subjects to enable the null hypothesis to be rejected, and thus not reaching statistical significance. Because clinical trials that are randomized are blinded, the actual number of subjects distributed throughout study arms is not defined until the end of the project. Although this maintains data collection integrity, there are inherent inefficiencies in the system, regardless of the outcome.

In a case where the study data reaches statistical significance, as accrual of subjects continues, and data is received, an optimal time to close a clinical study would be at the very moment when statistical significance is achieved. While that moment may arrive earlier in the course of a clinical trial, there is no way of knowing this, and therefore time and money are lost. Moreover, study subjects are enrolled above and beyond what is needed to reach the goals of the study, thus placing human subjects under experimentation unnecessarily.

In a case where the study data nearly reaches statistical significance, while the study data falls short of statistical significance, there is reason to believe that this is due to a shortage of enrollment in the study. Frequently, to develop more supportive data, clinical trials will be extended. These “extension studies”, however, can only begin after a full closure of the parent study, frequently requiring months to years before starting again.

In a case where the study data does not reach statistical significance, there is no trend toward significance, and there is little chance of reaching the desired conclusion. In that case, an optimal time to close a study is as early as possible once the conclusion can be established that the treatment under investigation does not work, and study data has little chance of reaching statistical significance (i.e., it is futile). In randomized and blinded clinical trials, this conclusion is difficult to arrive at until data analysis can be conducted. In these situations, time and money are lost. Moreover, an excess of human subjects is placed under study unnecessarily.

To mitigate some of the risks related to the conduct of randomized and blinded clinical trials, a Data Safety Monitoring Board (DSMB) may be formed at the beginning of each protocol. In general, a DSMB is recommended for clinical trials that involve a potentially serious outcome (e.g., death, heart attack, etc.), are randomized and blinded, and extend for prolonged periods of time. In addition, a DSMB is required for trials that are sponsored by the United States government, namely, the National Institutes of Health (NIH).

A DSMB generally consists of members who are domain experts in the field of study, such as physicians, as well as bio-statisticians. It is important that DSMB members be separate from personnel of the sponsor organization, and financial disclosure for all members is performed to minimize conflicts of interest. Prior to the start of a clinical trial, standard operating procedures are established for the DSMB, including the frequency of meetings, initiation of interim analyses, conduct during interim analyses and criteria for discontinuation of the clinical trial. As it relates to the safety of study subjects, the DSMB functions to examine trends of adverse occurrences rather than investigate specific reports, which are generally left to each IRB responsible for the activities of any given investigator. That is, the DSMB receives only a snapshot of the data of a clinical trial and not a continuous analysis of trial data as with the present invention. Additionally, if dangerous conditions/events (e.g., deaths of study patients) are detected, then the clinical trial must be suspended/interrupted to perform data analysis of the clinical trial. Further, the DSMB cannot determine whether such dangerous conditions exist with the control group taking the placebo or the study group taking the drug under study without suspending the clinical trial. That is, the snapshot data is not sufficient for the DSMB to determine the cause of the dangerous condition. Accordingly, the DSMB's specificity and sensitivity in detecting dangerous conditions is very low because it cannot determine whether the dangerous condition is related to the drug under study. Therefore, the present invention proceeds upon the desirability of resolving this problem by increasing the sensitivity to such dangerous conditions by performing continuous data analysis without interrupting the clinical trial.

A typical method of collecting and analyzing patient data is illustrated in the flow chart shown in FIG. 1. Patient data or charts 10 from the clinical trial are collected manually in paper forms. Using a technology called Electronic Data Capture (EDC) or Remote Data Entry (RDE), a computer (not shown) displays a Case Report Form (CRF) to a clinical research coordinator (CRC) 12, typically a nurse or doctor. The CRC 12 then enters the patient data 10 through the computer display which is received in block 14 by an EDC system which executes all of the steps included in a box 11. The received data is stored in a clinical trial database 38 through a link 20 which can be an electronic link such as a telephone line or Internet link. In block 18, it is determined whether the data inputted by the CRC 12 is clean using one or more rules. The rules may be implemented by simple range checking scripts, or by an inference rule engine or deterministic rule engine in order to identify potential problems with the data.

In addition to the software programs, block 18 may also involve research personnel known as monitors or Clinical Research Associates (CRA) who travel to the various research sites to perform source document verification (SDV) whereby the data in the database 38 is reconciled against individual patient charts to the degree required in the protocol.

If it is determined that the data entered is not clean, then block 22 generates a query which is then sent over the link 20 to the CRC 12. The blocks 14, 18 and 22 are repeated until all of the subject data 10 are entered. This is an iterative process that continues until resolution of all queries in the database 38.
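A simple range checking script of the kind mentioned for block 18, together with the query generation of block 22, might look like the following minimal sketch. The rule structure, the field name `systolic_bp` and its limits are hypothetical and chosen only for illustration; the source does not specify any particular rule format.

```python
from dataclasses import dataclass


@dataclass
class RangeRule:
    """Hypothetical rule: a numeric field must lie within [low, high]."""
    field: str
    low: float
    high: float


def find_queries(record, rules):
    """Return one query message per rule that the record violates (block 22)."""
    queries = []
    for rule in rules:
        value = record.get(rule.field)
        if value is None:
            queries.append(f"{rule.field}: missing value")
        elif not (rule.low <= value <= rule.high):
            queries.append(
                f"{rule.field}: {value} outside [{rule.low}, {rule.high}]"
            )
    return queries


# Illustrative rule and records; an empty result means the record is clean.
rules = [RangeRule("systolic_bp", 60, 250)]
print(find_queries({"systolic_bp": 120}, rules))
print(find_queries({"systolic_bp": 400}, rules))
```

In this sketch a non-empty list stands for the queries that would be sent back to the CRC 12 over the link 20; the record would be checked again after correction.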

Once all data 10 are entered, block 24 determines whether the clinical trial is over. If no, then the EDC system continues to receive the patient trial data 10 through block 14 as the trial continues. If the trial is over, control passes to block 26 where the entire database is locked from any changes, deletions or insertions of the data in the database 38. In one embodiment, locking involves turning the database 38 into a “read-only” state.

In block 28, blinding data is retrieved from a blinding database. A simplified example blinding database 40 is shown in FIG. 4. The blinding database 40 is a database table having two columns. The first column contains a patient subject ID (subject identifier) and the second column contains an associated study arm or group the patient belongs to. In the table 40, 13 subjects belong to Study Arm “A” and 12 subjects belong to Study Arm “B”. Because the database 40 is not associated with actual trial data, the table 40 by itself is relatively uninformative.

A simplified example trial database 38 is shown in FIG. 5. The embodiment shown is a database table containing two columns. The first column contains a patient subject ID and the second column is a database field called “Heart Attack” which specifies whether the subject had a heart attack. An entry of 0 means NO and entry of 1 means YES. As can be seen from the trial database 38, due to blinding of the subjects in the study groups, there is no way of knowing whether or not any discrepancy exists in the number of heart attacks seen in Group A versus B. Because the trial is randomized, without the blinding data 40, the table 38 by itself is relatively uninformative.

In block 28, an unblinded database is produced from the trial database 38 and the retrieved blinding database 40 in which the subject ID is used as a common key. The result of the unblinding process of block 28 is shown in FIG. 6 as the unblinded database 41. In the embodiment shown, one database table is produced. The table 41 contains subject identifiers, Study Arm of the subjects, and Heart Attack data of those subjects. As can be appreciated by a person of ordinary skill in the art, there is a direct traceability from study data and subject ID to Study Arm.
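The unblinding step amounts to joining the two tables on the subject ID. The following is a minimal sketch under stated assumptions: the tables are represented as lists of tuples, the function name `unblind` is hypothetical, and the three sample rows are made up for illustration and are not the contents of FIGS. 4 to 6.

```python
def unblind(trial_rows, blinding_rows):
    """Join trial data with study arm assignments on the subject ID.

    trial_rows:    iterable of (subject_id, heart_attack) with 0 = NO, 1 = YES
    blinding_rows: iterable of (subject_id, study_arm)
    Returns rows of (subject_id, study_arm, heart_attack).
    """
    arm_by_subject = dict(blinding_rows)
    return [
        (subject_id, arm_by_subject[subject_id], heart_attack)
        for subject_id, heart_attack in trial_rows
        if subject_id in arm_by_subject  # skip subjects with no arm on record
    ]


# Made-up sample rows, for illustration only.
trial = [(1, 0), (2, 1), (3, 1)]
blinding = [(1, "A"), (2, "B"), (3, "A")]
for row in unblind(trial, blinding):
    print(row)
```

Each output row carries the subject ID alongside the study arm, which is the direct traceability noted above.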

In block 30, statistical analysis is performed on the unblinded data 42 to find out the efficacy and safety of the completed clinical trial.

During the course of any given randomized and blinded clinical trial, an interim analysis may be conducted. An interim analysis may result from urging of the DSMB for cause, or be a pre-planned event as described in the study protocol.

Conducting an interim analysis involves a process where the available data is verified and cleaned. The clinical trial is typically interrupted or suspended to enable the available data to be verified and cleaned. The verification process generally involves a process by which trained personnel travel to the various research sites to reconcile submitted data against source documents, which generally implies the patient's chart, laboratory reports, radiographic readings, and others. The data cleaning process may involve a series of documented communications between the research site and a central data coordinating personnel to resolve inconsistencies or other conflicting data.

The refined database must then be sent to an impartial third party for statistical analysis. To conduct the analysis, the statistician must un-blind the clinical trial database by combining both the study data with the blinding key of which subjects are assigned to particular study arms. Since the clinical study is expected to continue beyond the interim analysis, the process of un-blinding must be conducted with great caution, so as not to reveal the blind status of subjects to any personnel involved in the execution of the clinical trial. Once a statistician has completed the interim analysis, a report is issued to the trial sponsor and DSMB.

Inclusive of the data cleaning, verification, un-blinding and statistical analysis processes, as well as the administrative resources for coordinating several groups of personnel for the un-blinding process, an interim analysis is often arduous, time-consuming and expensive.

In spite of the latest technological advancements made in the area of data collection through electronic systems, there is still a disadvantage in that it is very difficult to draw conclusions about a medical treatment while the data is being collected during the trial. This limitation stems primarily from the fact that statistical analysis cannot begin until the trial data has been fully cleaned and processed. At present, statistical analysis can only be conducted upon data in an “en bloc” fashion. This creates a situation where the ability to draw conclusions about a medical therapy inevitably lags behind the process of simply obtaining data in a database.

Regardless of how efficient the data collection process may be made through automation, the ability to acquire the information needed for critical decision-making is still suspended by the requirement to obtain a locked database in order for statistical work to advance.

Therefore, it is desirable to provide a method and system for conducting data analysis, i.e., statistical analysis, on the clinical data collected while the clinical trial is ongoing. This advantageously permits the present invention to identify positive or negative conditions/events/trends much more rapidly than possible with currently available systems and methods.

In the case of a randomized clinical trial where maintaining confidentiality is important, it is also desirable to provide a secure system in which the blinding information is integrated in such a way that the clinical trial data and blinding data are stored securely to prevent users from accessing the data and yet allow the execution of programs for performing statistical comparisons between study arms while the clinical trial is ongoing.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a system and method for continuously analyzing trial data of an ongoing clinical trial.

Another object of the present invention is to provide the system and method as aforesaid which analyzes the trial data without interrupting or suspending the ongoing clinical trial.

A further object of the present invention is to provide the system and method as aforesaid which performs statistical analysis on the trial data and repeatedly performs such statistical analysis until the result of such statistical analysis or the rate of change in a predetermined statistical parameter exceeds a predetermined threshold.

In accordance with an embodiment of the present invention, the system and method continuously analyzes trial data of an ongoing clinical trial. The system and method accesses the subject trial data from a trial database and performs statistical analysis on the accessed trial data. The system and method repeatedly performs the statistical analysis while the clinical trial is ongoing if it determines that the result of the statistical analysis or the rate of change of the statistical parameter does not exceed a predetermined threshold value.

In accordance with an exemplary embodiment of the present invention, a method of continuously analyzing trial data of an ongoing clinical trial comprises the steps of: accessing a trial database comprising trial data of subjects in the ongoing clinical trial; performing statistical analysis on the trial data to determine a parameter of statistical significance without suspending the ongoing clinical trial; determining whether the result of the statistical analysis exceeds a threshold value; and repeating the steps of accessing, performing and determining during the ongoing clinical trial if it is determined that the result of the statistical analysis does not exceed the threshold value.

In accordance with an exemplary embodiment of the present invention, a computer readable medium comprises code for continuously analyzing trial data of an ongoing clinical trial. The code comprises instructions for accessing a trial database comprising trial data of subjects in the ongoing clinical trial, performing statistical analysis on the trial data to determine a parameter of statistical significance without suspending the ongoing clinical trial, determining whether the result of the statistical analysis exceeds a threshold value, and repeating the steps of accessing, performing and determining during the ongoing clinical trial if it is determined that the result of the statistical analysis does not exceed the threshold value.

In accordance with an exemplary embodiment of the present invention, a system for continuously analyzing trial data of an ongoing clinical trial comprises a trial database comprising trial data of subjects in the ongoing clinical trial and a processor. The processor performs statistical analysis on the trial data to determine a parameter of statistical significance without suspending the ongoing clinical trial and determines whether the result of the statistical analysis exceeds a threshold value. The processor is operable to repeatedly access, perform and determine during the ongoing clinical trial if it is determined that the result of the statistical analysis does not exceed the threshold value.

In accordance with an embodiment of the present invention, the system and method uses user definable criteria that define the level of cleanliness of subject data for statistical analysis. In that case, only those subject data that meet the user defined criteria are selected from the trial database for statistical analysis.
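Selecting subject data by cleanliness can be sketched as a filter on a per-record status field. The status names and their ordering below are hypothetical placeholders, as is the function name `select_clean`; the actual status values would be whatever the trial's data management process defines.

```python
# Hypothetical cleanliness levels, ordered from least to most clean.
CLEANLINESS_ORDER = ["entered", "query_resolved", "source_verified"]


def select_clean(records, minimum_status):
    """Keep only records whose status is at least as clean as minimum_status."""
    floor = CLEANLINESS_ORDER.index(minimum_status)
    return [
        record
        for record in records
        if CLEANLINESS_ORDER.index(record["status"]) >= floor
    ]


# Illustrative records only.
records = [
    {"subject_id": 1, "status": "entered"},
    {"subject_id": 2, "status": "source_verified"},
    {"subject_id": 3, "status": "query_resolved"},
]
selected = select_clean(records, "query_resolved")
print([record["subject_id"] for record in selected])
```

A stricter criterion trades sample size for data quality: raising `minimum_status` leaves fewer records available for each pass of the analysis.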

In accordance with an embodiment of the present invention, the system and method uses a rate of change for a given statistical parameter. The rate of change may be a user definable criteria including the parameter to be measured and the time interval through which the parameter has changed.
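One way to read "rate of change" is the change in a tracked parameter divided by the elapsed time over a user-chosen interval. The sketch below assumes the parameter's history is kept as time-ordered (time, value) pairs; that representation, the function name `rate_of_change` and the sample numbers are all assumptions made for illustration.

```python
def rate_of_change(history, interval):
    """Change per unit time of a parameter over roughly the last `interval`.

    history:  list of (time, value) pairs sorted by increasing time
    interval: positive look-back window, in the same units as the times
    Returns None when the history does not reach back far enough.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if len(history) < 2:
        return None
    t_now, v_now = history[-1]
    # Latest observation that is at least `interval` older than the newest one.
    earlier = [(t, v) for t, v in history if t <= t_now - interval]
    if not earlier:
        return None
    t_prev, v_prev = earlier[-1]
    return (v_now - v_prev) / (t_now - t_prev)


# Made-up p-values recorded on days 0, 7 and 14.
history = [(0, 0.40), (7, 0.30), (14, 0.10)]
print(rate_of_change(history, interval=7))
```

A negative rate for a p-value would indicate a trend toward significance; which sign and magnitude should trigger attention is left to the user-defined criteria.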

In accordance with an embodiment of the present invention, the system and method uses a value for the degree of disparity in any given statistical evaluation between two groups of a multi-arm clinical study. The value for the degree of disparity may be a user definable criteria including the parameter to be measured.

In accordance with an embodiment of the present invention, the system and method waits for a predetermined time period before repeating the statistical analysis if the result of the statistical analysis does not exceed the threshold value. This is done so additional subject data can be collected and added to the trial database.
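The access, analyze, compare and repeat cycle, including the wait between passes, can be sketched as a simple loop. This is an illustration under stated assumptions, not the disclosed implementation: `fetch_trial_data` and `analyze` are hypothetical callables supplied by the caller, the result is assumed to be a score where larger means stronger evidence (for example, one minus a p-value), and `max_passes` exists only so the sketch can terminate.

```python
import time


def monitor(fetch_trial_data, analyze, threshold,
            wait_seconds=0.0, max_passes=None, sleep=time.sleep):
    """Repeat the analysis on live trial data until the result exceeds threshold.

    Returns (result, passes); result is None if max_passes ran out first.
    """
    passes = 0
    while max_passes is None or passes < max_passes:
        data = fetch_trial_data()   # read the current data; no database lock
        result = analyze(data)      # run the chosen statistical model
        passes += 1
        if result > threshold:
            return result, passes   # caller can alert a user at this point
        sleep(wait_seconds)         # let more subject data accumulate
    return None, passes


# Toy run: the "database" grows by one record per pass, the "analysis" is a count.
snapshots = iter([[1], [1, 2], [1, 2, 3]])
result, passes = monitor(lambda: next(snapshots), len, threshold=2,
                         sleep=lambda seconds: None)
print(result, passes)
```

Passing `sleep` as a parameter is a design choice of this sketch so that the waiting period can be replaced in tests; a deployed system would more likely use a scheduler.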

In accordance with an embodiment of the present invention, the clinical trial is blinded. Accordingly, the system and method of the present invention accesses a blinding database in addition to the trial database to obtain additional information, such as subject identifiers and associated study group identifiers. Each study group identifier identifies which study group a particular subject belongs to. The system and method of the present invention generates a grouped database from the clinical database and the blinding database for statistical analysis in which the trial data is grouped according to the subject's study group. Preferably, the system and method generates, for each study group, a data table that contains trial data associated with all of the subjects that belong to that study group.
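Generating one data table per study group can be sketched as a grouping pass over the trial data keyed by the blinding table. The tuple-based tables and the function name `group_by_arm` are hypothetical. Note one assumption in particular: this sketch leaves subject identifiers out of the group tables, which is a choice made here for illustration and is not stated in the text above.

```python
from collections import defaultdict


def group_by_arm(trial_rows, blinding_rows):
    """Build one list of trial values per study group.

    trial_rows:    iterable of (subject_id, value)
    blinding_rows: iterable of (subject_id, study_group)
    Returns {study_group: [value, ...]} without subject identifiers.
    """
    group_by_subject = dict(blinding_rows)
    tables = defaultdict(list)
    for subject_id, value in trial_rows:
        group = group_by_subject.get(subject_id)
        if group is not None:  # ignore subjects with no group assignment
            tables[group].append(value)
    return dict(tables)


# Made-up rows, for illustration only.
trial = [(1, 0), (2, 1), (3, 1)]
blinding = [(1, "A"), (2, "B"), (3, "A")]
print(group_by_arm(trial, blinding))
```

The per-group lists are all a between-arm comparison needs, for example event counts and group sizes for a test of proportions.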

In accordance with an embodiment of the invention, the system and method stores the unblinded database in a memory device that is inaccessible by any user in order to preserve the blindness of the clinical trial. The unblinded database is physically part of the trial database and/or electronic data collection (EDC) system of the present invention, but logically separated by user access-permission. Alternatively, the unblinded database can be physically separate from the trial database.

In accordance with an embodiment of the invention, the system and method performs the statistical analysis without locking the trial database.

In accordance with an embodiment of the invention, if the result of the statistical analysis or the rate of change of a statistical parameter exceeds the threshold value, a user is alerted. The predetermined threshold value may include a predetermined statistical significance value or a rate of change.

In accordance with an embodiment of the invention, the system and method offers many statistical models to users to choose from. The system and method retrieves and runs a user selected statistical model on the clinical trial database.

In accordance with an embodiment of the invention, the system and method graphically presents the statistical parameters and their trends over time to end-users with the correct permission level.

In accordance with an embodiment of the present invention, the system and method enables the user to adjust the distribution of the subjects within the blinding table for future enrollees to be grouped in a particular manner.

Various other objects, advantages, and features of the present invention will become readily apparent from the ensuing detailed description, and the novel features will be particularly pointed out in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The following detailed description, given by way of example, and not intended to limit the present invention solely thereto, will best be understood in conjunction with the accompanying drawings in which:

FIG. 1 is a flow diagram of a method of collecting and analyzing clinical trial data using an EDC system;

FIG. 2 is a functional block diagram of a clinical trial management system in accordance with an exemplary embodiment of the present invention;

FIG. 3 is a flow diagram of a software routine that continuously analyzes the trial data while the clinical trial is ongoing in accordance with an exemplary embodiment of the present invention;

FIG. 4 is an example of a blinding database;

FIG. 5 is an example of a trial database containing subject trial data;

FIG. 6 is an example of an unblinded database derived from the blinding database of FIG. 4 and the trial database of FIG. 5 in accordance with an exemplary embodiment of the present invention;

FIG. 7A is an example of a trial database containing a status field that represents the levels of cleanliness of the subject data records in accordance with an exemplary embodiment of the present invention;

FIG. 7B is a filtered trial database containing a subset of the trial database of FIG. 7A which have been selected as a function of a user specified status in accordance with an exemplary embodiment of the present invention; and

FIG. 8 is an example of a grouped database derived from the blinding database of FIG. 4 and the filtered trial database of FIG. 7B in accordance with an exemplary embodiment of the present invention.

DETAILED DESCRIPTION OF THE EMBODIMENTS

The present application is applicable to any clinical studies utilizing electronic data collection, including but not limited to collecting clinical data over a network from a plurality of trial participants. Clinical studies can involve multiple groups to enable comparisons to be made between subjects receiving the actual medication versus placebo. Also, clinical studies can involve a single study group, wherein data collected from the clinical trial can be compared to data from other similar clinical trials or studies, or to historical data. Although randomization or randomized studies lend themselves to clinical trials with multiple study groups, it is not necessary for clinical trials with a single study group. It is appreciated that randomization is not necessarily required for clinical trials with multiple study groups; the study subjects can request assignment to a particular study group or arm, or can be assigned to a particular study group at the discretion of the investigator.

Most clinical trials utilize multiple study arms, randomization, and blinding so they can maintain scientific rigor and conform to requirements of FDA approval for a new drug and/or new treatment. However, these rigorous clinical trials can lead to several challenges, since they prevent the clinical trial sponsor from tracking key information related to safety and efficacy while the clinical trial is ongoing. Such analysis and comparisons are not possible with the prior art systems and methods until the clinical trials have ended or unless the clinical trials are interrupted/suspended. That is, it is conceivable that potential positive or negative effects of a drug under study will go unnoticed until the clinical trial is completed. The present invention proceeds upon the desirability of making such positive or negative information available while the clinical trial is ongoing and without interrupting the clinical trial.

Even in non-randomized, single arm, and unblinded clinical studies, little if any statistical conclusion can be drawn until after the trial database comprising the trial data can be cleaned of errors and locked. Current operational methods call for paper forms that are filled out by investigators and later keyed into a database. This latter activity may lag behind the paper forms for many days to weeks, ultimately compounding the delays seen in a clinical trial that may involve hundreds of thousands of forms. The delay in transforming data from a paper form to an electronic format further delays the analysis of clinical study data because the data cleaning operation cannot start until the data is in electronic format. The present invention resolves these issues by performing this data analysis on trial data while the clinical trial is ongoing and without interrupting the clinical trial.

Turning now to FIG. 2, there is shown a clinical trial data management system 100 in accordance with an exemplary embodiment of the present invention which is an Internet-enabled application solution framework that automates data collection, data cleaning, grouping if needed (as will be explained more fully later herein) and statistical data analysis while the trial is ongoing. The system 100 is connected to a computer network such as the Internet 120 through, for example, an I/O interface 102, which receives information from and sends information to Internet users over a communication link 20 and to one or more operators using a work station 117. The Internet users are typically clinical research coordinators (CRCs) located at various trial sites who transcribe the subjects' charts to the system 100. The system 100 comprises, for example, memory 104 which is volatile, processor (CPU) 106, program storage 108, and data storage device 118, all commonly connected to each other through a bus 112. The program storage 108 stores, among others, a clinical trial analysis program or module 114 and one or more mathematical models 116 that are used to analyze the subject data and obtain the p-value for statistical significance. The data storage device 118 stores a clinical trial database 38 and blinding database 40. An example of a clinical trial database 38 is shown in FIG. 5 and an example of a blinding database 40 is shown in FIG. 4. The clinical trial database 38 and the blinding database 40 can be separated either physically or logically through user access-permission. The blinding database 40 can be accessed manually by permitted users or by the system 100 to establish grouping assignments for upcoming or future enrollment into the clinical trial. It is appreciated that some clinical studies are performed without the use of blinding. In such instances, the system does not utilize the blinding database 40 because the study patients and/or their investigators are informed of what therapy is being received and/or administered, respectively. Any of the software program modules in the program storage 108 and data from the data storage 110 are transferred to the memory 104 as needed and are executed by the processor 106.

The system 100 can be any computer such as a WINDOWS-based or UNIX-based personal computer, server, workstation, minicomputer or a mainframe, or a combination thereof. While the system 100 is illustrated as a single computer unit for purposes of clarity, persons of ordinary skill in the art will appreciate that the system may comprise a group of computers which can be scaled depending on the processing load and database size.

FIG. 3 illustrates a flow diagram of a software routine 50 that continuously analyzes the trial data while the trial is ongoing in accordance with an exemplary embodiment of the present invention. The routine 50 is stored in the storage device 108 and works with the EDC system 11 of FIG. 1 while the system 11 continuously collects and cleans the trial data in accordance with an exemplary embodiment of the present invention.

The routine 50 connects to a trial database 56 through a log-in procedure at block or step 52. A simplified exemplary trial database 56 is shown in FIG. 7A. The trial database 56 contains three columns comprising a patient subject ID field, a data status field, which specifies the level of cleanliness, and a “Heart Attack” field similar to FIG. 5.

FIG. 7A illustrates simplified trial data records that are at different levels of cleanliness. In the example shown in FIG. 7A, there are five levels of status. Level 1 indicates that there is an outstanding query that needs to be answered by the CRC 12 (see step 22 in FIG. 1). Level 2 indicates that the record is pending a review by another reviewer such as the sponsor of the trial. Level 3 indicates that the record is pending a review by a clinical research associate (CRA), who travels to a research site to perform what is known as a source document verification (SDV). This typically involves a verification of the trial record against an actual patient chart. Level 4 indicates that the record is pending a lock, barring any intervention by any reviewer. Finally, Level 5 indicates that the record is locked, which represents the highest level of clean data.
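By way of non-limiting illustration, the five status levels described above could be represented as an ordered enumeration so that "status of N or better" becomes a simple comparison. The following Python sketch is an assumption made for clarity only; the names `CleanStatus` and `at_least` are hypothetical and do not appear in the figures.

```python
from enum import IntEnum


class CleanStatus(IntEnum):
    """Five cleanliness levels as described for FIG. 7A (member names are illustrative)."""
    QUERY_OUTSTANDING = 1  # an open query still has to be answered by the CRC
    PENDING_REVIEW = 2     # awaiting review by another reviewer, e.g. the sponsor
    PENDING_SDV = 3        # awaiting source document verification by a CRA
    PENDING_LOCK = 4       # awaiting lock
    LOCKED = 5             # locked; highest level of clean data


def at_least(status: CleanStatus, minimum: CleanStatus) -> bool:
    """Return True when a record is at the requested level of cleanliness or better."""
    return status >= minimum
```

Because the levels are ordered integers, a higher value always denotes cleaner data, which matches the ordering described for the status field.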

In the “Heart Attack” field, an entry of 0 means NO and an entry of 1 means YES. The “Heart Attack” field also includes some erroneous data such as “don't know” for subject 118 or “Y” for subject 107. Accordingly, the status for those records indicates a “1” in which queries are outstanding.

Once connected, the routine 50 retrieves a user specified criteria 54 stored in the storage device 108 which specifies the status or level of cleanliness of the trial database at step 60 and retrieves the trial database 56 which is filtered for those database records that satisfy the retrieved criteria at step 61. For example, if the retrieved user specified criteria is 3, the routine 50 selects only those records that have a status of 3 or better at step 61. Such a filtered database 58 is shown in FIG. 7B. While the database 58 has a relatively higher level of cleanliness, it has fewer records. This is useful since, at any given point in time during the data collection process, the clinical trial database 56 may have data that has any combination of data pending SDV, containing outstanding queries, completed SDV but awaiting lock, and so on. Depending upon the operating procedures defined for any such clinical trial, only certain subsets of data may be suitable for inclusion in an analysis.
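The filtering of step 61 can be sketched, under stated assumptions, as a selection of records whose status meets or exceeds the user specified criteria. In the following Python sketch the record layout mirrors the three columns described for FIG. 7A, but the type `TrialRecord`, the function `filter_by_status`, and all sample rows other than the erroneous entries for subjects 107 and 118 are hypothetical and are not taken from the figures.

```python
from typing import List, NamedTuple


class TrialRecord(NamedTuple):
    subject_id: int
    status: int        # 1 (query outstanding) through 5 (locked)
    heart_attack: str  # "0" = NO, "1" = YES, anything else is erroneous


def filter_by_status(records: List[TrialRecord], min_status: int) -> List[TrialRecord]:
    """Keep only records whose cleanliness status is min_status or better."""
    return [r for r in records if r.status >= min_status]


# Illustrative data only; subjects 107 and 118 carry the erroneous entries
# mentioned in the text, the remaining rows are invented for the example.
trial_db = [
    TrialRecord(101, 5, "0"),
    TrialRecord(107, 1, "Y"),
    TrialRecord(112, 3, "1"),
    TrialRecord(118, 1, "don't know"),
    TrialRecord(120, 2, "0"),
]

filtered_db = filter_by_status(trial_db, min_status=3)
```

With a criteria of 3, the records carrying outstanding queries are excluded, so the erroneous “Heart Attack” entries never reach the statistical analysis.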

Once the trial database 58 is filtered according to the user specified criteria at step 61, the routine 50 retrieves the blinding data, such as the exemplary blinding data shown in FIG. 4, from the blinding database 40 at step 62. The routine 50 utilizes the filtered trial database 58 and the blinding database 40 to produce a grouped database 42, such as the example shown in FIG. 8, at step 64. In accordance with an exemplary embodiment of the present invention, two database tables 66, 68, one for each study group without identifying subjects, are produced, for example, as shown in FIG. 8. One table 66 groups the Heart Attack data of subjects that belong to a control group (Study Arm A) while the other table 68 groups the Heart Attack data of subjects that belong to a non-control group (Study Arm B). As can be appreciated by persons of ordinary skill in the art, there is no way to trace the origins of any given data point in either table 66 or table 68 to its original subject, and therefore either table, by itself, is relatively uninformative. Taken together, however, note that there seem to be considerably more heart attacks occurring in Study Arm B.
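As a minimal sketch of the grouping at step 64, the following Python function joins filtered trial data with blinding data and emits one value list per study arm with the subject identifiers dropped. The function name `group_by_arm`, the data layout, and the sample values are assumptions for illustration. Sorting the values within each arm is likewise an added design choice, not something stated in the description, made so that row order carries no information about enrollment order.

```python
from typing import Dict, Iterable, List, Tuple


def group_by_arm(
    filtered_trial: Iterable[Tuple[int, int]],  # (subject_id, heart_attack as 0/1)
    blinding: Dict[int, str],                   # subject_id -> study arm label
) -> Dict[str, List[int]]:
    """Pool outcome values per study arm, discarding subject identifiers."""
    grouped: Dict[str, List[int]] = {}
    for subject_id, value in filtered_trial:
        arm = blinding[subject_id]
        grouped.setdefault(arm, []).append(value)
    # Sort within each arm so position in the table cannot hint at a subject.
    return {arm: sorted(values) for arm, values in grouped.items()}


# Hypothetical example data, not the contents of FIG. 4 or FIG. 7B.
blinding_db = {101: "A", 112: "A", 125: "B", 130: "B"}
filtered_trial_db = [(101, 0), (112, 0), (125, 1), (130, 1)]

grouped_db = group_by_arm(filtered_trial_db, blinding_db)
```

The result contains only arm labels and pooled values, so no single output table links a subject identifier, a study arm, and an outcome.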

Turning now to FIG. 3, there is illustrated a process of continuously analyzing trial data of an ongoing randomized clinical trial in accordance with an exemplary embodiment of the present invention. The system 100 maintains the clinical trial database 38 and the blinding database 40 as separate physical and digital entities, in order to maintain their distinct nature. Alternatively, the distinction between the two databases could be logical rather than physical, based upon user access-permission. In other words, the trial data and blind data remain as two separate data tables and no table is created containing all of the following information: the subject identifier, study group and heart attack status. Furthermore, system communication with the blinding database table occurs only by virtue of the machine programs of the present invention executing specified actions to sort the clinical trial data in accordance with an exemplary embodiment of the present invention. The clinical trial data is preferably segregated into generic pools of data and remains de-identified or unlinked to both the subject and the study arm, and thus indecipherable from the standpoint of the ability to trace a particular data item back to a specific subject.

The routine 50 retrieves a user defined analysis method 72 stored in the storage device 108 and retrieves the method from the mathematical models 116 stored in the storage device. The routine 50 of the present invention runs the model to analyze the grouped database 42 at step 70. Preferably, the routine 50 of the present system and method obtains a parameter of statistical significance, e.g., a p-value (a statistical significance of the safety and efficacy of the unblinded database 41) at step 76. An exemplary unblinded database 41 in accordance with an exemplary embodiment of the present invention is shown in FIG. 6. It is appreciated that the mathematical model can include one or more formulas, representing mathematical calculations, whereby one or more variables in the clinical trial database are identified, and a numeric result can be obtained. Such formulas can include calculations of: mean, median, mode, range, average deviation, standard deviation, and variance. In addition, an administrator can enter mathematical formulas to further analyze the data to make comparisons between groups of data, as defined by the study arms, to determine statistical metrics and significance by methods including Chi-square analysis, t-test, f-test, one-tailed test, two-tailed test, and Analysis of Variance (ANOVA). In accordance with an aspect of the present invention, the system and method stores these calculated statistical values and associated time points, thereby allowing the present system and method to track trends, such as a rate of change.
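As one hedged example of a mathematical model of the kind listed above, a Chi-square analysis of a binary outcome across two study arms can be sketched in Python using only the standard library. The sketch assumes a Pearson Chi-square test on a 2x2 table without continuity correction, for which the p-value with one degree of freedom equals erfc(sqrt(chi2/2)). The function name `chi_square_2x2` and the sample data are hypothetical, and the description does not prescribe this particular formulation.

```python
import math
from typing import Sequence, Tuple


def chi_square_2x2(arm_a: Sequence[int], arm_b: Sequence[int]) -> Tuple[float, float]:
    """Pearson Chi-square (no continuity correction) for 0/1 outcomes in two arms.

    Returns (chi2, p_value). With one degree of freedom the survival function
    of the Chi-square distribution is erfc(sqrt(chi2 / 2)).
    """
    a = sum(arm_a)        # events in arm A
    b = len(arm_a) - a    # non-events in arm A
    c = sum(arm_b)        # events in arm B
    d = len(arm_b) - c    # non-events in arm B
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no evidence of a difference.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical pooled data: 2 events among 20 subjects vs. 10 among 20.
arm_a_values = [1] * 2 + [0] * 18
arm_b_values = [1] * 10 + [0] * 10
chi2, p_value = chi_square_2x2(arm_a_values, arm_b_values)
```

The function consumes only pooled per-arm values, so it can operate on grouped tables that carry no subject identifiers.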

Once the mathematical analysis is completed, the routine retrieves a user-defined p-value 74 stored in the storage device 108 at step 76. The routine 50 then determines at step 78 whether the derived p-value meets the retrieved user defined p-value, that is, whether the derived p-value is lower than the user defined value. As discussed in detail herein, a typical user defined p-value can be 0.05, below which the difference between the control group and non-control group is considered statistically significant. Thus, if the derived value is less than 0.05, then the inquiry at step 78 is answered in the affirmative and the routine 50 sends an alert to the user or operator without displaying the actual output value(s) at step 80. The alert can be in the form of a flashing display, alarm, a change in the system output display to the user by virtue of color-coding, fonts, icons or text, or an automated system generated message to the user by way of email, facsimile, telephone or pager.
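The decision at step 78 and the value-free alert at step 80 can be sketched as follows. This is an illustrative Python sketch only; the function names and the alert wording are assumptions, and the delivery channel (email, pager and so on) is deliberately left out.

```python
from typing import Optional


def significance_reached(derived_p: float, user_defined_p: float = 0.05) -> bool:
    """Step 78 (sketch): affirmative when the derived p-value is below the threshold."""
    return derived_p < user_defined_p


def build_alert(derived_p: float, user_defined_p: float = 0.05) -> Optional[str]:
    """Step 80 (sketch): return an alert text that does not reveal the derived value."""
    if significance_reached(derived_p, user_defined_p):
        # Only the fact that the threshold was met is reported, never the number.
        return "ALERT: a monitored parameter has reached the pre-set level of significance."
    return None
```

Returning only a fixed message, rather than the computed value, reflects the stated intent of alerting the user without displaying the actual output value(s).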

Alternatively, in accordance with an exemplary embodiment of the present invention, the routine 50 can retrieve a user-defined rate of change value for a given statistical parameter at step 76. The value of the rate of change can be a positive or negative number, or any indication of positivity or negativity in the rate of change. A negative rate of change in the p-value can indicate a lack of efficacy in a particular study arm, and the routine 50 can establish this negative rate of change in the p-value as a trigger for alerting the user or operator at step 78. Thus a negative rate of change in the p-value would result in the inquiry at step 78 being answered in the affirmative, resulting in the routine 50 sending an alert to the user or operator at step 80.
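A rate-of-change trigger of this kind presupposes that each calculated value is stored together with its time point. The following Python sketch assumes a simple two-point difference quotient over the two most recent observations; the names `rate_of_change` and `rate_trigger`, the time unit, and the sample history are hypothetical. Consistent with the passage above, a negative rate is treated as the trigger.

```python
from typing import List, Tuple

# Each entry pairs a time point (e.g. days since trial start) with a stored p-value.
History = List[Tuple[float, float]]


def rate_of_change(history: History) -> float:
    """Difference quotient between the two most recent stored values."""
    if len(history) < 2:
        raise ValueError("at least two time points are required")
    (t0, v0), (t1, v1) = history[-2], history[-1]
    if t1 == t0:
        raise ValueError("time points must differ")
    return (v1 - v0) / (t1 - t0)


def rate_trigger(history: History) -> bool:
    """Affirmative when the most recent rate of change is negative."""
    return rate_of_change(history) < 0


# Hypothetical stored values: the p-value moved from 0.40 on day 0 to 0.26 on day 7.
p_history: History = [(0.0, 0.40), (7.0, 0.26)]
```

A production system might smooth over more than two points; the two-point form is used here only because it is the simplest instance of a rate of change.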

In accordance with an exemplary embodiment of the present invention, the routine 50 retrieves a user-defined difference in a statistical value between two study arms of the clinical study at step 76. Such difference can signify a divergence in statistical trends between the study arms of a clinical study and the routine 50 can establish this difference in the statistical value as a trigger for alerting the user or operator at step 78. Thus if the degree of disparity between the two study arms of a clinical study exceeds a user-defined value, then the inquiry at step 78 is answered in the affirmative and the routine 50 sends an alert to the user or operator at step 80.
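For illustration, the degree of disparity between two study arms could be taken as the absolute difference between the per-arm means of a monitored parameter. The description does not fix a particular statistic, so the choice of the mean, the function names, and the sample data in the following Python sketch are all assumptions.

```python
from statistics import mean
from typing import Sequence


def arm_disparity(arm_a: Sequence[float], arm_b: Sequence[float]) -> float:
    """Absolute difference between the mean values of two study arms."""
    return abs(mean(arm_a) - mean(arm_b))


def disparity_trigger(
    arm_a: Sequence[float], arm_b: Sequence[float], user_defined_value: float
) -> bool:
    """Affirmative when the disparity exceeds the user-defined value."""
    return arm_disparity(arm_a, arm_b) > user_defined_value


# Hypothetical 0/1 outcome data: event rate 0.25 in one arm, 0.75 in the other.
arm_a_events = [0, 0, 0, 1]
arm_b_events = [1, 1, 0, 1]
```

For 0/1 outcome data the mean is simply the event rate of the arm, so the disparity is the difference in event rates.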

In accordance with an exemplary embodiment of the present invention, at step 82, the routine 50 can generate and display output in accordance with the user defined output mode 84, as the generic data tables 66, 68 generated at step 64. The output data can take various formats including plain text, American Standard Code for Information Interchange (ASCII), and SAS. Where appropriate, this allows the user to perform customized statistical analysis using the present invention. It is appreciated that these outputs can also be integrated with other software packages to generate customized graphical reports.

In accordance with an exemplary embodiment of the present invention, if the trial is a randomized clinical trial, then the routine 50 stops at step 80 and provides a Boolean output as to whether or not a particular study parameter has reached the desired level of statistical significance at step 80. The routine 50 skips step 82 or makes it available only to a select group of users based upon access-permission. It is appreciated that this functionality or accessibility can be determined by an administrative user as a configurable aspect of the present system and software. This advantageously maintains the blinding information as secure as possible, thereby minimizing any inference that can be made about the study arm of any given subject. In monitoring the exact numeric determination of statistical significance for any given clinical trial variable, it is conceivable that the accession of new data could cause statistical metrics for a particular study arm to change in such a manner that an inference can be made regarding the blinding status of the subject whose data was most recently added, thus compromising the statistical veil.

It is appreciated that even in non-randomized clinical trials, the display of a specific numeric value corresponding to a parameter of statistical significance by the routine 50 at step 80 is useful and beneficial. Since there is no blinding information to protect in non-randomized clinical trials, the display of such parameter of statistical significance can be offered as a second mode of operation by the present invention. Alternatively, the present invention can provide a third mode of operation, whereby numeric ranges of statistical significance can be defined into groups that can be displayed to the user of the present invention.
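The third mode of operation, in which only a range is displayed, can be sketched as a lookup of the band into which a value falls. In the following Python sketch the band boundaries and labels are hypothetical examples chosen for illustration; the description leaves the ranges to be defined by the user or administrator.

```python
import bisect

# Hypothetical band boundaries and labels; not specified in the description.
BAND_BOUNDS = [0.01, 0.05, 0.10]
BAND_LABELS = [
    "p < 0.01",
    "0.01 <= p < 0.05",
    "0.05 <= p < 0.10",
    "p >= 0.10",
]


def significance_band(p_value: float) -> str:
    """Map an exact p-value to the label of the range that contains it."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    # bisect_right places a value equal to a boundary into the upper band.
    return BAND_LABELS[bisect.bisect_right(BAND_BOUNDS, p_value)]
```

Displaying a band instead of the exact number gives the user more than a Boolean result while still withholding the precise value.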

However, if the derived p-value is higher than the user-defined p-value, the rate of change is positive or the degree of disparity between the study arms does not exceed the user defined threshold value, then the inquiry at step 78 is answered in the negative and the routine 50 proceeds to step 86. The routine 50 waits a predetermined time so additional clinical data is collected and stored in the clinical trial database 58 at step 86 and proceeds to step 52. The routine 50 then repeats the process of analyzing the trial data of an ongoing clinical trial. In other words, the system 100 is active throughout the data collection phase of the clinical trial, sending alerts when key parameters reach the pre-set or predetermined level of statistical measure or significance.
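The repeat-and-wait behavior of steps 78 and 86 can be sketched as a polling loop. In this Python sketch `run_analysis` is a hypothetical stand-in for steps 52 through 76 and is assumed to return the derived p-value; `max_cycles` is an added assumption representing the end of the data collection phase; and the `sleep` parameter exists only so the wait can be replaced during testing.

```python
import time
from typing import Callable, Optional


def monitor(
    run_analysis: Callable[[], float],
    user_defined_p: float,
    wait_seconds: float,
    max_cycles: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Repeat the analysis until the threshold is met or the cycles run out.

    Returns the 1-based cycle in which the inquiry was answered in the
    affirmative, or None if that never happened.
    """
    for cycle in range(1, max_cycles + 1):
        derived_p = run_analysis()      # stands in for steps 52 through 76
        if derived_p < user_defined_p:  # step 78 answered in the affirmative
            return cycle                # the caller would raise the alert of step 80
        sleep(wait_seconds)             # step 86: wait while more data is collected
    return None
```

For example, if successive analyses yielded p-values of 0.4, 0.2 and 0.03 against a user-defined p-value of 0.05, the loop would wait twice and return on the third cycle.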

As can be appreciated by persons of ordinary skill in the art, the ability of the present clinical trial system 100 to continuously and confidentially monitor and analyze the trial data for statistical significance in tandem with data collection while the trial is ongoing is a tremendous benefit to the researchers. The trial database no longer becomes the bottleneck in obtaining useful results and statistical analysis can be conducted on a near real-time basis.

This continuous near real-time statistical analysis feature in turn has far reaching implications. Specifically, by providing researchers with an early indication of the clinical trial, the present invention shortens the time frame required to reach critical decisions about a new medical therapy. Still another advantage is that the present system improves patient safety by setting thresholds for triggering alerts for adverse events. A related advantage is that a futile trial can be ended early, thereby saving the substantial cost of conducting the trial. Conversely, for a successful medical treatment, a trial can be ended early or the placebo arm can be eliminated. Based upon statistical trends, the distribution of enrollees can be altered while the clinical trial is ongoing in order to adjust or better test the objectives of the clinical trial or scientific hypothesis. The present invention also provides the ability to more accurately identify the need to perform a full-scale interim analysis.

In accordance with an exemplary embodiment of the present invention, a computer readable media comprising a code for continuously analyzing trial data of an ongoing clinical trial is provided. The code comprises instructions for accessing a trial database comprising trial data of subjects in the ongoing clinical trial, performing statistical analysis on the trial data to determine a parameter of statistical significance without suspending the ongoing clinical trial, determining whether the result of the statistical analysis exceeds a threshold value, and repeating the steps of accessing, performing and determining during the ongoing clinical trial if it is determined that the result of the statistical analysis does not exceed the threshold value. Further, the code comprises instructions for selecting only those subject data that meet the user defined criteria from the trial database for statistical analysis. The user defined criteria define the level of cleanliness of the subject data for statistical analysis.

In accordance with an exemplary embodiment of the present invention, a system for continuously analyzing trial data of an ongoing clinical trial comprises a trial database comprising trial data of subjects in said ongoing clinical trial and a processor. The processor performs statistical analysis on the trial data to determine a parameter of statistical significance without suspending the ongoing clinical trial and determines whether the result of the statistical analysis exceeds a threshold value. The processor is operable to repeatedly access, perform and determine during the ongoing clinical trial if it is determined that the result of the statistical analysis does not exceed the threshold value.

Various omissions, modifications, substitutions and changes in the forms and details of the device illustrated and in its operation can be made by those skilled in the art without departing in any way from the spirit of the present invention. Accordingly, the scope of the invention is not limited to the foregoing specification, but instead is given by the appended claims along with their full range of equivalents.

1. A computer readable storage medium comprising computer executable instructions that, when executed by a processor, cause the processor to: calculate, during at least one time interval of an ongoing clinical trial, a statistical term of interest associated with a parameter in trial data of at least one arm of the ongoing clinical trial; and determine, in near-real time, whether the calculated statistical term of interest at the at least one time interval indicates the occurrence of a statistically significant event in the ongoing clinical trial.

2. The computer readable storage medium of claim 1, further comprising instructions that, when executed by the processor, cause the processor to iteratively calculate and determine based on whether the statistically significant event has occurred.

3. The computer readable storage medium of claim 1, further comprising instructions that, when executed by the processor, cause the processor to retrieve the trial data for an ongoing clinical trial from a plurality of records maintained in at least one data storage device.

4. The computer readable storage medium of claim 3, further comprising instructions that, when executed by the processor, cause the processor to retrieve the plurality of records having a predetermined level of cleanliness that satisfies a user defined cleanliness criteria.

5. The computer readable storage medium of claim 3, wherein the plurality of records include at least two of a subject identifier, an arm identifier, a value associated with said parameter, and a cleanliness value.

6. The computer readable storage medium of claim 3, further comprising computer executable instructions that, when executed by a processor, cause the processor to unblind the retrieved trial data; and group the retrieved trial data into a plurality of arms.

7. The computer readable storage medium of claim 6, wherein said determined statistical term of interest is determined by comparing a first parameter of a first arm to a second parameter of a second arm of the ongoing clinical trial.

8. The computer readable storage medium of claim 1, wherein said statistically significant event represents one of efficacy of the ongoing clinical trial, safety of the ongoing clinical trial, and level of participation in said at least one arm of the ongoing clinical trial.

9. The computer readable storage medium of claim 1, wherein instructions to determine further cause the processor to determine a rate of change in said determined statistical term of interest between at least two time intervals during the ongoing clinical trial.

10. The computer readable storage medium of claim 1, further comprising instructions that, when executed by the processor, cause the processor to generate an alert based on the determination that said statistically significant event has occurred during the ongoing clinical trial.

11. A system to analyze trial data of a multi-arm study in an ongoing clinical trial, comprising: a processing device configured to: calculate, during at least one time interval of the ongoing clinical trial, a statistical term of interest associated with a parameter in trial data of at least one arm of the ongoing clinical trial; and determine, in near-real time, whether the calculated statistical term of interest at the at least one time interval indicates the occurrence of a statistically significant event in the ongoing clinical trial.

12. The system of claim 11, wherein the system is further configured to iteratively calculate and determine based on whether the statistically significant event has occurred.

13. The system of claim 11, wherein the processor is further configured to retrieve the trial data for an ongoing clinical trial from a plurality of records maintained in at least one storage device.

14. The system of claim 13, wherein the processor is further configured to retrieve the plurality of records having a predetermined level of cleanliness that satisfies a user defined cleanliness criteria.

15. The system of claim 13, wherein the plurality of records include at least two of a subject identifier, an arm identifier, a value associated with said parameter, and a cleanliness value.

16. The system of claim 13, wherein the processor is further configured to unblind the retrieved trial data; and group the retrieved trial data into a plurality of arms.

17. The system of claim 16, wherein said determined statistical term of interest is determined by comparing a first parameter of a first arm to a second parameter of a second arm of the ongoing clinical trial.

18. The system of claim 11, wherein said statistically significant event represents one of efficacy of the ongoing clinical trial, safety of the ongoing clinical trial, and level of participation in said at least one arm of the ongoing clinical trial.

19. The system of claim 11, wherein the processor is further configured to determine a rate of change in said determined statistical term of interest between at least two time intervals during the ongoing clinical trial.

20. The system of claim 11, wherein the processor is further configured to generate an alert based on the determination that said statistically significant event has occurred during the ongoing clinical trial.

21. A method comprising: calculating in a computing device, during at least one time interval of the ongoing clinical trial, a statistical term of interest associated with a parameter in trial data of at least one arm of the ongoing clinical trial; and determining in the computing device, in near-real time, whether the calculated statistical term of interest during the at least one time interval indicates the occurrence of a statistically significant event in the ongoing clinical trial.

22. The method of claim 21, further comprising iteratively calculating and determining based on whether the statistically significant event has occurred.

23. The method of claim 21, further comprising retrieving the trial data of the ongoing clinical trial from trial records maintained in at least one data storage device.

24. The method of claim 23, wherein retrieving the trial data includes retrieving a portion of the trial records having a level of cleanliness that satisfies a predetermined criterion.

25. The method of claim 23, wherein a trial record includes at least two of a subject identifier, an arm identifier, a parameter value, and a cleanliness value.

26. The method of claim 21, further comprising: unblinding the trial data; and grouping the unblinded trial data into a plurality of arms.

27. The method of claim 26, wherein said determining the statistical term of interest includes comparing a first parameter of a first arm to a second parameter of a second arm of the ongoing clinical trial.

28. The method of claim 21, wherein said statistically significant event represents one of efficacy of the ongoing clinical trial, safety of the ongoing clinical trial, and level of participation in said at least one arm of the ongoing clinical trial.

29. The method of claim 21, further comprising determining a rate of change in said determined statistical term of interest between at least two time intervals during the ongoing clinical trial.

30. The method of claim 21, further comprising generating an alert based on the determination that said statistically significant event has occurred during the ongoing clinical trial.